Semantics graphs for enterprise communication networks

ABSTRACT

Building a semantics graph for an enterprise communication network can include calculating a distance metric between a first signifier and a second signifier associated with an enterprise communication network, wherein the distance metric includes a plurality of relationships defined based on a frequency of co-occurrences of the first signifier and the second signifier, and building a semantics graph for the enterprise communication network using the calculated distance metric.

PRIORITY INFORMATION

This application is a Continuation of U.S. application Ser. No. 13/755,556, filed Jan. 31, 2013, the entire contents of which are incorporated herein by reference.

BACKGROUND

Crawling and retrieval of web content can include browsing the World Wide Web in a methodical and/or orderly fashion to create a copy of visited pages for later processing by a search engine. However, due to the current size of the Web, search engines cannot index the entire Web.

Prior approaches to crawling and retrieving web content include the use of focused web crawlers. A focused web crawler estimates a probability of a visited page being relevant to a focus topic and retrieves a link corresponding to the page only if a target probability is reached; however, a focused web crawler may not retrieve a sufficient number of links or sufficiently relevant links. For example, a focused web crawler can download only a fraction of Web pages visited.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating an example of a method for building a semantics graph for an enterprise communication network according to the present disclosure.

FIG. 2 is a flow chart illustrating an example of a process for building a semantics graph for an enterprise communication network according to the present disclosure.

FIG. 3 illustrates an example of a system according to the present disclosure.

DETAILED DESCRIPTION

An enterprise may use an enterprise network, such as a cloud system and/or Internet network, to distribute workloads. An enterprise network, as used herein, can include a network system to offer services to users of the enterprise (e.g., employees and/or customers). A service, as used herein, can include an intangible commodity offered to users of a network. Such services can include computing resources (e.g., storage, memory, processing resources) and/or computer-readable instructions (e.g., programs). A user may benefit from another user's experience with a particular service. However, due to the distributed nature of an enterprise network, users may have difficulty in sharing knowledge, such as services experiences.

In some situations, an enterprise may use an enterprise communication network to assist users of an enterprise network in sharing knowledge, learning from other users' services experiences, and searching for content relevant to the enterprise and/or the enterprise network. The enterprise communication network, as used herein, can include an electronic communication network to connect users of the network to relevant content. Users of the enterprise communication network can contribute to the enterprise communication network through a range of activities such as posting service-related entries, linking entries to content available on internal and external domains, reading comments, commenting on comments, and/or voting on users' entries. Thereby, the enterprise communication network can act as a social network associated with the enterprise, services offered by the enterprise, and/or documents associated with the enterprise, among other topics.

However, the range of activities that users can contribute to an enterprise communication network can result in the enterprise communication network containing unstructured content. Due to the unstructured nature of the content, a general purpose search engine may not properly function to allow users to search for content in the enterprise communication network. General purpose search engines may utilize measures such as back-links and/or clicks to define a quality and reputation of searched content. In an enterprise communication network, the quality and reputation of content may not be proportional to the number of back-links and/or clicks.

In contrast, in examples of the present disclosure a relatedness of content within the enterprise communication network can be identified by automatically learning semantics of signifiers within the enterprise communication network and/or the enterprise network. The signifiers can be identified by gathering content using a search tool and extracting signifiers from the gathered content. A relatedness of the identified signifiers can be defined by calculating a distance metric between pairs of signifiers. Using the defined distance metric, a semantics graph can be built that identifies the proximity of relations between the signifiers. A semantics graph can assist in tagging and searching for content within the enterprise communication network.

Examples of the present disclosure may include methods, systems, and computer-readable and executable instructions and/or logic. An example method for building a semantics graph for an enterprise communication network can include calculating a distance metric between a first signifier and a second signifier associated with an enterprise communication network, wherein the distance metric includes a plurality of relationships defined based on a frequency of co-occurrences of the first signifier and the second signifier, and building the semantics graph for the enterprise communication network using the calculated distance metric.

In the following detailed description of the present disclosure, reference is made to the accompanying drawings that form a part hereof, and in which is shown by way of illustration how examples of the disclosure may be practiced. These examples are described in sufficient detail to enable those of ordinary skill in the art to practice the examples of this disclosure, and it is to be understood that other examples may be utilized and that process, electrical, and/or structural changes may be made without departing from the scope of the present disclosure.

The figures herein follow a numbering convention in which the first digit or digits correspond to the drawing figure number and the remaining digits identify an element or component in the drawing. Similar elements or components between different figures may be identified by the use of similar digits. Elements shown in the various examples herein can be added, exchanged, and/or eliminated so as to provide a number of additional examples of the present disclosure.

In addition, the proportion and the relative scale of the elements provided in the figures are intended to illustrate the examples of the present disclosure, and should not be taken in a limiting sense. As used herein, “a number of” an element and/or feature can refer to one or more of such elements and/or features.

FIG. 1 is a block diagram illustrating an example of a method 100 for building a semantics graph for an enterprise communication network according to the present disclosure. The method 100 can be used to build a semantics graph for an enterprise communication network containing a plurality of signifiers.

An enterprise communication network, as used herein, can include a network connecting a plurality of users and content through a range of activities. The activities can be related to a services network of the enterprise (e.g., enterprise network). For example, the activities can include posting service-related entries, linking entries to internal enterprise domains and/or external domains, and/or reading, commenting, and/or voting on other users' entries. In various examples of the present disclosure, the enterprise communication network can be a sub-portion of and/or contained within the enterprise network.

A semantics graph, as may be built using the method 100, can allow users of the enterprise communication network to search for content within the enterprise communication network. A general purpose search engine may not be able to search for content in the enterprise communication network given the unstructured nature of the content. Such a search engine may function by defining a quality and reputation of content (e.g., domains) based on a number of back-links (e.g., links from other content) and/or clicks by a user. However, content in the enterprise communication network may not have back-links and/or clicks proportional to the quality and/or reputation of the content. In some instances, content in the enterprise communication network may not have measurable back-links and/or clicks (e.g., email). In order to search content within the enterprise communication network, semantics of signifiers within the enterprise network may be automatically learned. For instance, automatically learning the semantics of signifiers can include building a semantics graph to identify proximity of signifiers using the method 100.

At 102, the method 100 for building a semantics graph for an enterprise communication network can include calculating a distance metric between a first signifier and a second signifier associated with the enterprise communication network, wherein the distance metric includes a plurality of relationships defined based on a frequency of co-occurrences of the first signifier and the second signifier. For instance, the plurality of relationships can be based on a frequency of co-related services, a frequency of co-related phrases, and an average location of the first signifier and the second signifier (as discussed further herein).

A signifier, as used herein, can include a word, phrase, and/or acronym within the content of the enterprise network and/or the enterprise communication network. The signifiers can be gathered, in various examples, using search tools (e.g., web crawlers) and extraction tools (e.g., extractors) (as discussed further herein). A signifier associated with the enterprise communication network can include a signifier gathered from the enterprise network and/or the enterprise communication network.

A distance metric, as used herein, can include a calculated numerical score. The numerical score can represent the proximity of relation between a first signifier and a second signifier. For instance, calculating the distance metric can include calculating a weighted Euclidean distance including constructing an n-dimensional feature vector. A Euclidean distance can include an ordinary distance (e.g., numerical description of a distance) between two points. The distance metric can be based on a plurality of criteria to construct the n-dimensional feature vector. Such criteria can be based on a frequency of co-occurrences of the first signifier and the second signifier in the enterprise network and/or the enterprise communication network (e.g., a plurality of relationships). Examples of co-occurrences can include the first signifier and the second signifier in the same list, table, paragraph, and/or linked content (e.g., domains), among other co-occurrences.
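
As a minimal sketch of the weighted Euclidean distance mentioned above, the following Python function compares two n-dimensional feature vectors. The function name, the layout of the feature vectors, and the choice of weights are assumptions made for illustration; the present disclosure does not specify them.

```python
import math
from typing import Sequence


def weighted_euclidean(x: Sequence[float],
                       y: Sequence[float],
                       weights: Sequence[float]) -> float:
    """Weighted Euclidean distance between two n-dimensional feature vectors.

    Each dimension is assumed (hypothetically) to hold one co-occurrence
    criterion, and each weight scales that criterion's contribution.
    """
    if not (len(x) == len(y) == len(weights)):
        raise ValueError("x, y, and weights must have the same length")
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(x, y, weights)))
```

With unit weights the function reduces to the ordinary Euclidean distance; setting a weight to zero removes that criterion from the score.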

At 104, the method 100 for building a semantics graph for an enterprise communication network can include building the semantics graph for the enterprise communication network using the calculated distance metric. A semantics graph, as used herein, can include a data structure representing concepts that are related to one another. The concepts can include language (e.g., words, phrases, acronyms), for instance. The semantics graph can include a plurality of nodes connected by a plurality of edges. A node can include a vertex representing a signifier. The edges can connect related signifiers. Each edge can be weighted with the score defined by the calculated distance metric between pairs of related signifiers (e.g., the first signifier and the second signifier). Weighting an edge with a score, as used herein, can include associating the score with the edge connecting a pair of related signifiers.

For instance, the method 100 can include adding the first signifier and the second signifier as nodes on the semantics graph with an edge connecting the first signifier and the second signifier. The edge connecting the first signifier and the second signifier can be weighted with a score defined by the calculated distance metric, in various examples.
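
One possible in-memory representation of such a graph is sketched below in Python. The class name SemanticsGraph and its methods are hypothetical and are offered only to illustrate nodes as signifiers and edges weighted with a distance score; other data structures could serve equally well.

```python
from collections import defaultdict


class SemanticsGraph:
    """Undirected graph: nodes are signifiers, edge weights are distance scores."""

    def __init__(self):
        # Maps each signifier to a dict of {related signifier: score}.
        self._adjacency = defaultdict(dict)

    def add_relation(self, first, second, score):
        """Add both signifiers as nodes and weight the connecting edge."""
        self._adjacency[first][second] = score
        self._adjacency[second][first] = score

    def score(self, first, second):
        """Return the edge score, or None if the signifiers are not connected."""
        return self._adjacency.get(first, {}).get(second)

    def nearest(self, signifier):
        """Related signifiers ordered from smallest to largest distance."""
        neighbors = self._adjacency.get(signifier, {})
        return sorted(neighbors, key=neighbors.get)
```

Because a smaller score indicates a closer relation, ordering neighbors by ascending score gives a simple way to suggest related tags or search terms for a signifier.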

FIG. 2 is a flow chart illustrating an example of a process 210 for building a semantics graph for an enterprise communication network according to the present disclosure.

At 212, the process 210 can include gathering content. For instance, a search tool can gather content from the enterprise network and/or the enterprise communication network. A search tool, as used herein, can include hardware components and/or computer-readable instruction components designated and/or designed to scan the enterprise network and/or the enterprise communication network to collect data. For instance, the search tool can search the enterprise network for the plurality of signifiers (e.g., words, phrases, and/or acronyms). The data can include documents and/or data associated with the enterprise communication network and/or the enterprise network. Such data can include Hypertext Markup Language (HTML) content, email communications, and/or other documents (e.g., SharePoint documents).

In various examples of the present disclosure, a repository builder can gather the content and build a repository with the gathered content. A repository builder can include hardware components and/or computer-readable instruction components designated and/or designed to build a repository. A repository can include a source storage system. For example, a repository can include a file folder and/or shared directory. The repository may store the gathered content, for instance.

At 214, the process 210 can include extracting signifiers. Signifiers can be extracted from the content gathered (e.g., at 212). For instance, an extraction tool may extract the signifiers. An extraction tool can include hardware components and/or computer-readable instruction components that extract information from an unstructured and/or semi-structured structure (e.g., the content gathered).

The extracted signifiers can include a plurality of words, phrases, and/or acronyms extracted through pattern recognition techniques. For instance, with HTML content, signifiers can be located in the title, lists, links, tables, paragraphs, and/or linked content (e.g., domains). The pattern recognition technique used by an extraction tool can identify the location and/or format of the title, lists, links, and tables on the HTML document and extract their members as signifiers.
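
The extraction step could be sketched, for HTML content, with the Python standard library's html.parser module. The class name SignifierExtractor and the particular set of tags are assumptions for illustration, and the sketch assumes well-formed markup in which every targeted tag is explicitly closed; it is not the extraction tool of the present disclosure.

```python
from html.parser import HTMLParser


class SignifierExtractor(HTMLParser):
    """Collects the text of titles, list items, links, and table cells."""

    # Hypothetical choice of tags whose members are treated as signifiers.
    TARGET_TAGS = {"title", "li", "a", "td", "th"}

    def __init__(self):
        super().__init__()
        self._depth = 0        # number of currently open target tags
        self._buffer = []      # text collected inside the open target tags
        self.signifiers = []   # extracted signifiers, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.TARGET_TAGS:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in self.TARGET_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self._flush()

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)

    def _flush(self):
        # Collapse runs of whitespace and keep non-empty text only.
        text = " ".join("".join(self._buffer).split())
        self._buffer = []
        if text:
            self.signifiers.append(text)
```

A caller would create the extractor, pass a gathered HTML document to its feed method, and then read the signifiers attribute. Nested targets, such as a link inside a list item, yield a single signifier.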

At 216, the process 210 can include calculating (e.g., determining) distance metrics for related signifiers. A distance metric for related signifiers can include a calculated distance metric between a first signifier and a second signifier. The process 210 can be used to define a set of proximities (e.g., distance metrics) of the plurality of signifiers as extracted (e.g., at 214).

For instance, as illustrated in the example of FIG. 2, calculating a distance metric between a first signifier and a second signifier can include calculating a ratio of co-related services associated with the first signifier and the second signifier (e.g., related signifiers) associated with an enterprise communication network, calculating a ratio of co-related phrases associated with the first signifier and the second signifier, and averaging (e.g., median) a location of the first signifier and the second signifier on the enterprise network (e.g., a plurality of relationships defined based on a frequency of co-occurrences). The sum of the services ratio, the phrases ratio, and the average location can include the distance metric between the first signifier and the second signifier.

Calculating a ratio of co-related services associated with related signifiers can include:

${{d_{1}\left( {u,v} \right)} = \frac{\sum\left( {{services}\mspace{14mu} {related}\mspace{14mu} {to}\mspace{14mu} {both}\mspace{14mu} u\mspace{14mu} {and}\mspace{14mu} v} \right)}{\sum\left( {{{services}\mspace{14mu} {related}\mspace{14mu} {to}\mspace{14mu} u} + {{services}\mspace{14mu} {related}\mspace{14mu} {to}\mspace{14mu} v}} \right)}},$

wherein the calculated ratio d₁ (u, v) can include a sum of services related to both the first signifier u and the second signifier v divided by a sum of services related to the first signifier u plus services related to the second signifier v. Related services, as used herein, can include a service that references a signifier (e.g., u or v). Services related to both signifiers u, v can include domains and/or documents associated with a service that contains both signifiers u and v. Services related to u can include services related to the first signifier but not related to the second signifier (e.g., services that reference u but do not reference v). Services related to v can include services related to the second signifier but not related to the first signifier (e.g., services that reference v but do not reference u). In other words, the denominator in the ratio of d₁ (u, v) can include a sum of independent services (e.g., related to u independent of v and related to v independent of u).
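
The ratio d₁ (u, v) as written above could be sketched in Python using sets of service names. The function name services_ratio is hypothetical, the sums are taken to be simple counts, and the handling of an empty denominator is an assumption, since the present disclosure does not state what happens when no service is related to only one of the two signifiers.

```python
import math


def services_ratio(services_u: set, services_v: set) -> float:
    """Count of shared services divided by the count of independent services."""
    both = services_u & services_v      # services referencing both u and v
    only_u = services_u - services_v    # services referencing u but not v
    only_v = services_v - services_u    # services referencing v but not u
    denominator = len(only_u) + len(only_v)
    if denominator == 0:
        # Assumption: with no independent services the ratio is unbounded
        # when services are shared, and zero when there are no services.
        return math.inf if both else 0.0
    return len(both) / denominator
```

For example, if u is referenced by services a, b, and c and v is referenced by services b, c, and d, two services are shared and two are independent, so the ratio is 1.0.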

In various examples of the present disclosure, determining services related to a first signifier u and a second signifier v can include determining services each signifier (e.g., u and v) is related to. Determining services related to a signifier can include calculating a distance from a service domain to a domain retrieved by the search tool (e.g., web crawler) that contains the signifier (e.g., u or v). The service domain (e.g., web page) can include an Internet page that is the main location of the service. The domain retrieved can include an Internet page that the signifier is located on. The distance from the service domain to the retrieved domain can include a number of links from the service domain to the retrieved domain. In some instances, there may be multiple paths (e.g., sequence of links) for a user to go from the retrieved domain to the service domain and/or vice versa. The distance, in such an instance, can include the path with the lowest number of links among the multiple paths. Thereby, each signifier can have a vector of distances between the retrieved domain and each service line. Related services to a signifier (e.g., first signifier u) can be based on retrieved domains the signifier appears on and a vector of distances from the retrieved domains. For instance, a related service can include a service with a distance between a retrieved domain and the service domain that is below a threshold distance.
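
The link distance and the threshold test described above could be sketched with a breadth-first search, which finds the path with the lowest number of links. The function names, the dictionary representation of links, and the example threshold are assumptions for illustration only.

```python
from collections import deque


def link_distance(links, start, target):
    """Lowest number of links from start to target, or None if unreachable.

    links is assumed to map each page to the pages it links to.
    """
    if start == target:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        page, distance = queue.popleft()
        for linked_page in links.get(page, ()):
            if linked_page == target:
                return distance + 1
            if linked_page not in seen:
                seen.add(linked_page)
                queue.append((linked_page, distance + 1))
    return None


def related_services(links, service_domains, retrieved_domain, threshold):
    """Services whose domain lies fewer than threshold links from the retrieved domain."""
    related = set()
    for service, service_domain in service_domains.items():
        distance = link_distance(links, service_domain, retrieved_domain)
        if distance is not None and distance < threshold:
            related.add(service)
    return related
```

Calling related_services once per retrieved domain on which a signifier appears, and taking the union of the results, would give the set of services related to that signifier under these assumptions.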

The denominator in the ratio of d₁ (u, v) can include a normalization factor. In addition, the numerator can include a monotonically decreasing function and the denominator can include a monotonically increasing function. A monotonic function can include a function between ordered sets that preserves the order. A monotonically decreasing function can include a function wherein the Y-axis decreases (e.g., the distance metric) as the X-axis increases (e.g., sum of services related to both u and v). A monotonically increasing function can include a function wherein the Y-axis increases (e.g., distance metric) as the X-axis decreases (e.g., sum of services related to u plus the services related to v). Thereby, a distance metric for a first signifier and a second signifier can be smaller than a distance metric between a third signifier and a fourth signifier in response to identifying the first signifier and the second signifier relate to a service (e.g., and the third signifier and fourth signifier do not).

Calculating a ratio of co-related phrases associated with related signifiers can include:

$d_{2}(u, v) = \alpha - s(u, v).$

Alpha can denote a numerical value that remains constant. For instance, in some examples, alpha can be limited to a constant numerical value that is greater than the max of s(u, v). As used herein, s(u, v) can denote common phrases between a first signifier u and a second signifier v. For instance, s(u, v) can include a ratio of a sum of words common to both u and v divided by a sum of the number of words in u plus the number of words in v (e.g., the total phrases of u and v). For example, s(u, v) can be defined as:

$s(u, v) = \frac{\sum\left(\text{words common to } u \text{ and } v\right)}{\sum\left(\text{number of words in } u + \text{number of words in } v\right)}.$

Thereby, a distance metric for a first signifier and a second signifier can be smaller than a distance metric between a third signifier and a fourth signifier in response to identifying the first signifier and the second signifier have co-related words (e.g., and the third signifier and fourth signifier do not).
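
A minimal Python sketch of s(u, v) and d₂ (u, v) follows. The function names, the whitespace tokenization, the case folding, and the default value of alpha are assumptions. Under this tokenization s(u, v) cannot exceed 0.5, so a default alpha of 1.0 satisfies the condition that alpha be greater than the max of s(u, v).

```python
def common_phrase_ratio(u: str, v: str) -> float:
    """s(u, v): distinct words common to u and v over the total word count."""
    words_u = u.lower().split()
    words_v = v.lower().split()
    total = len(words_u) + len(words_v)
    if total == 0:
        # Assumption: two empty signifiers share nothing.
        return 0.0
    common = set(words_u) & set(words_v)
    return len(common) / total


def phrases_distance(u: str, v: str, alpha: float = 1.0) -> float:
    """d2(u, v) = alpha - s(u, v); a larger word overlap gives a smaller distance."""
    return alpha - common_phrase_ratio(u, v)
```

For instance, the signifiers "cloud storage" and "cloud backup" share one word out of four in total, giving s = 0.25 and, with alpha = 1.0, d₂ = 0.75, whereas two signifiers with no common words give d₂ = 1.0.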

Calculating an average of the location (e.g., distance) between related signifiers can include:

$d_{3}(u, v) = \operatorname{median}\left(\text{location between } u \text{ and } v\right).$

The average location between u and v can include a median of the location distances between u and v on an HTML domain. For example, d₃ (u, v) can be defined by a plurality of criteria. The criteria can include rules. An example of the plurality of rules can include:

a. if u and v appear on linked domains, then the distance between u and v may be smaller than if they do not.

b. if u and v appear on the same (e.g., identical) domain, then the distance between u and v may be smaller than if they do not and smaller than (a).

c. if u and v appear in the same (e.g., identical) sub-portion (e.g., table, list, paragraph) of the same domain, then the distance between u and v may be smaller than if they do not and smaller than (a) and (b).

Thereby, a mathematical representation of the rules can include distance (a) > distance (b) > distance (c). A smaller distance can indicate signifiers are more related than a larger distance, for instance. Although the present example illustrates the average location as a median of the location, examples in accordance with the present disclosure are not so limited. An average location can include a mean, a geometric mean, an average percentage, and/or a mode, among other averaging techniques.
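
The rules above fix only an ordering of distances, not their values. The following Python sketch assigns hypothetical numerical values that respect that ordering and takes the median over the observed co-occurrences of a pair of signifiers. The names, the labels for the kinds of co-occurrence, and the numbers themselves are assumptions for illustration.

```python
from statistics import median

# Hypothetical scale; only the ordering (c) < (b) < (a) comes from the rules.
LOCATION_DISTANCE = {
    "same_section": 1.0,    # rule (c): same table, list, or paragraph
    "same_domain": 2.0,     # rule (b): same domain
    "linked_domains": 3.0,  # rule (a): linked domains
}
UNLINKED_DISTANCE = 4.0     # assumed value when no co-occurrence is observed


def location_distance(co_occurrences):
    """d3(u, v): median location distance over the observed co-occurrences.

    co_occurrences is assumed to be a list of LOCATION_DISTANCE keys,
    one entry per place where u and v were found together.
    """
    if not co_occurrences:
        return UNLINKED_DISTANCE
    return median(LOCATION_DISTANCE[kind] for kind in co_occurrences)
```

Replacing median with another function from the statistics module, such as mean or mode, would give one of the alternative averaging techniques mentioned above.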

As an example of calculating d₃, four signifiers may be identified on an enterprise network and/or an enterprise communication network. The first signifier u may be related to the second signifier v and may be located on HTML domains linked together. The second signifier v may be related to the third signifier w but not located on linked HTML domains. The first signifier u and the third signifier w may be unrelated. The third signifier w may be related to the fourth signifier y and may be found on the same HTML domain. The first signifier u may be related to the fourth signifier y and may be found on the same table and/or list on the same HTML domain. The second signifier v and the fourth signifier y may be found to be unrelated. The distance metrics associated with the four signifiers (e.g., u, v, w, y) can be summarized as:

$d_{3}(u, y) < d_{3}(w, y) < d_{3}(u, v) < d_{3}(v, w) < d_{3}(v, y),\ d_{3}(u, w).$

The distance metric can be denoted by, for example:

$d(u, v) = d_{1}(u, v) + d_{2}(u, v) + d_{3}(u, v),$

and can be calculated for each subset of related signifiers (e.g., each pair of related signifiers). Thereby, a plurality of distance metrics calculated for a plurality of related signifiers can include a set of proximities between the plurality of signifiers.
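
The summation over pairs of related signifiers could be sketched as follows. The function names and the dictionary layout, which maps each pair of related signifiers to its three component scores, are assumptions for illustration; the component scores in the usage below are made-up values.

```python
def distance_metric(d1: float, d2: float, d3: float) -> float:
    """d(u, v) = d1(u, v) + d2(u, v) + d3(u, v)."""
    return d1 + d2 + d3


def proximities(component_scores):
    """Set of proximities: one summed distance per pair of related signifiers.

    component_scores is assumed to map (u, v) to a (d1, d2, d3) tuple.
    """
    return {pair: distance_metric(*scores)
            for pair, scores in component_scores.items()}
```

For example, a pair with component scores (0.5, 0.75, 3.0) receives a distance of 4.25, and each resulting entry can then be added to the semantics graph as a weighted edge.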

At 218, the process 210 can include building a semantics graph for the enterprise communication network. The semantics graph can be built using the calculated distance metric between the first signifier and the second signifier. In various examples of the present disclosure, the semantics graph can include the defined distance metrics of the plurality of pairs of related signifiers. The set of proximities can be represented (e.g., added to the semantics graph) as edges between the nodes as defined by the distance metrics between related signifiers. The set of edges includes a set of proximities of the plurality of signifiers as defined by distance metrics between pairs of related signifiers.

The process 210 can utilize a semantics builder 220 for calculating distance metrics for related signifiers (e.g., 216) and/or building the semantics graph (e.g., 218). The semantics builder 220 can include hardware components and/or computer-readable instruction components designated and/or designed to build a semantics graph associated with the enterprise communication network. For instance, the semantics graph can include the set of signifiers as nodes with a set of proximities between the set of signifiers. The set of proximities can be represented (e.g., added to the semantics graph) as edges between the nodes as defined by the distance metrics between related signifiers.

FIG. 3 illustrates a block diagram of an example of a system 322 according to the present disclosure. The system 322 can utilize software, hardware, firmware, and/or logic to perform a number of functions.

In a number of examples, the system 322 can be any combination of hardware (e.g., one or more processing resources 324, computer-readable medium (CRM), etc.) and program instructions (e.g., computer-readable instructions (CRI)) configured to build a semantics graph for an enterprise communication network. A processing resource 324, as used herein, can include any number of processors capable of executing instructions stored by a memory resource 328. Processing resource 324 can be integrated in a single device or distributed across devices.

The memory resource 328 can be in communication with a processing resource 324 (e.g., one or more processing devices). For instance, the processing resource 324 can be in communication with a tangible non-transitory CRM (e.g., memory resource 328) storing a set of CRI executable by the processing resource 324, as described herein. The CRI can also be stored in remote memory managed by a server and represent an installation package that can be downloaded, installed, and executed. The system 322 can include memory resource 328, and the processing resource 324 can be coupled to the memory resource 328. Further, memory resource 328 may be fully or partially integrated in the same device as processing resource 324 or it may be separate but accessible to that device and processing resource 324. Thus, it is noted that the system 322 may be implemented on a user and/or a client device, on a server device and/or a collection of server devices, and/or on a combination of the user device and the server device and/or devices.

Processing resource 324 can execute CRI that can be stored on an internal or external memory resource 328. The processing resource 324 can execute CRI to perform various functions, including the functions described with respect to FIG. 1 and FIG. 2.

The CRI can include a number of modules 330, 332, 334. The number of modules 330, 332, 334 can include CRI that when executed by the processing resource 324 can perform a number of functions.

The modules 330, 332, 334 can be sub-modules of other modules. For example, the distance metric module 332 and the build semantics graph module 334 can be sub-modules and/or contained within the same computing device. In another example, the modules 330, 332, 334 can comprise individual modules at separate and distinct locations (e.g., CRM, etc.).

An extract module 330 can include CRI that when executed by the processing resource 324 can provide a number of extraction functions. The extract module 330 can extract a plurality of signifiers from an enterprise network and/or an enterprise communication network using an extraction tool.

In various examples of the present disclosure, the system 322 can include a search module (not illustrated in the example of FIG. 3). The search module can include CRI that when executed by the processing resource 324 can provide a number of search functions. The search module can search the enterprise network and/or the enterprise communication network for content (e.g., documents, signifiers, and/or other relevant data). The content searched for by the search module can be used by the extract module 330 to extract the plurality of signifiers, for instance.

A distance metric module 332 can include CRI that when executed by the processing resource 324 can perform a number of calculation functions. The distance metric module 332 can define a distance metric between pairs of related signifiers among the plurality of signifiers. Related signifiers can include signifiers that have a co-occurrence on the enterprise network and/or the enterprise communication network.

The distance metric module 332 can include instructions to define a distance metric between pairs of related signifiers that includes instructions to calculate a ratio of co-related services associated with both a first signifier and a second signifier and services related independently to the first signifier and services related independently to the second signifier; calculate a ratio of co-related phrases associated with both the first signifier and the second signifier and phrases related independently to the first signifier and to the second signifier; average a location of the first signifier and the second signifier on the enterprise network; and define the distance metric as a sum of the ratio of co-related services, the ratio of co-related phrases, and the average location.

A build semantics graph module 334 can include CRI that when executed by the processing resource 324 can perform a number of building graph functions. The build semantics graph module 334 can build a semantics graph using the defined distance metrics between pairs of related signifiers, including the defined distance metric of the first signifier and the second signifier.

A memory resource 328, as used herein, can include volatile and/ornon-volatile memory, and can be integral, or communicatively coupled, toa computing device, in a wired and/or a wireless manner. The memoryresource 328 can be in communication with the processing resource 324via a communication path 326 local or remote to a machine (e.g., acomputing device) associated with the processing resource 324. Thecommunication path 326 can be such that the memory resource 328 isremote from the processing resource (e.g., 324), such as in a networkconnection between the memory resource 328 and the processing resource(e.g., 324).

The processing resource 324 coupled to the memory resource 328 canexecute CRI to extract a plurality of signifiers from an enterprisenetwork using an extraction tool. The processing resource 324 coupled tothe memory resource 328 can also execute CRI to define a distance metricbetween related signifiers among the plurality of signifiers, whereindefining a distance metric between each pair of related signifiersincludes: calculate a ratio of co-related services associated with botha first signifier and a second signifier and services relatedindependently to the first signifier independent and services relatedindependently to the second signifier; calculate a ratio of co-relatedphrases associated with both the first signifier and the secondsignifier and phrases related independently to the first signifier andphrases related independently to the second signifier; average alocation of the first signifier and the second signifier on theenterprise network; and define the distance metric as a sum of the ratioof co-related services, the ratio of co-related phrases, and the averagelocation. The processing resource 324 coupled to the memory resource 328can also execute CRI to build a semantics graph for the enterprisecommunication network using the defined distance metrics between pairsof related signifiers, including the defined distance metric of thefirst signifier and the second signifier.

As used herein, “logic” is an alternative or additional processingresource to execute the actions and/or functions, etc., describedherein, which includes hardware (e.g., various forms of transistorlogic, application specific integrated circuits (ASICs), etc.), asopposed to computer executable instructions (e.g., software, firmware,etc.) stored in memory and executable by a processor.

The specification examples provide a description of the applications and use of the system and method of the present disclosure. Since many examples can be made without departing from the spirit and scope of the system and method of the present disclosure, this specification sets forth some of the many possible example configurations and implementations.

What is claimed:
 1. A method for building a semantics graph for an enterprise communication network, comprising: calculating a distance metric between a first signifier and a second signifier associated with the enterprise communication network, wherein the distance metric includes a plurality of relationships defined based on a frequency of co-occurrences of the first signifier and the second signifier using a computing device; and building a semantics graph for the enterprise communication network using the calculated distance metric.
 2. The method of claim 1, wherein calculating the distance metric includes calculating a weighted Euclidean distance including constructing an n-dimensional feature vector.
 3. The method of claim 1, wherein building the semantics graph includes adding the first signifier and the second signifier as nodes on the semantics graph with an edge connecting the first signifier and the second signifier.
 4. The method of claim 3, including weighting the edge with a score defined by the calculated distance metric.
 5. The method of claim 1, including searching an enterprise network for a plurality of signifiers, including words, phrases, and/or acronyms using a search tool.
 6. The method of claim 1, including calculating the distance metric based on a plurality of criteria.
 7. A non-transitory computer-readable medium storing a set of instructions executable by a processing resource, wherein the set of instructions can be executed by the processing resource to: calculate a distance metric between a first signifier and a second signifier associated with an enterprise communication network, wherein the distance metric includes a plurality of relationships defined based on: a frequency of co-related services associated with the first signifier and the second signifier; a frequency of co-related phrases associated with the first signifier and the second signifier; and an average location of the first signifier and the second signifier on the enterprise network; and build a semantics graph for the enterprise communication network using the calculated distance metric.
 8. The medium of claim 7, wherein the instructions executable by the processing resource include instructions executable to extract a plurality of signifiers from the enterprise network using an extraction tool, wherein the plurality of signifiers include the first signifier and the second signifier.
 9. The medium of claim 8, wherein the instructions executable by the processing resource include instructions executable to calculate a distance metric for related signifiers among the plurality of signifiers, including the distance metric for the first signifier and the second signifier.
 10. The medium of claim 7, wherein the instructions executable by the processing resource include instructions executable to: divide the frequency of co-related services associated with the first signifier and the second signifier by a frequency of services related independently to the first signifier and services related independently to the second signifier to determine a ratio of co-related services of the first signifier and the second signifier; and divide the frequency of co-related phrases associated with the first signifier and the second signifier by a frequency of phrases related independently to the first signifier and phrases related independently to the second signifier to determine a ratio of co-related phrases of the first signifier and the second signifier.
 11. The medium of claim 10, wherein the instructions executable by the processing resource to calculate the distance metric between a first signifier and a second signifier include instructions executable to define the distance metric as a sum of the ratio of co-related services, the ratio of co-related phrases, and the average location.
 12. The medium of claim 7, wherein the instructions executable by the processing resource to build the semantics graph include instructions executable to add a first node, a second node, and an edge to the semantics graph, wherein the first node represents the first signifier, the second node represents the second signifier, and the edge connects the first node and the second node.
 13. The medium of claim 12, wherein the edge is weighted by the distance metric.
 14. A system for building a semantics graph for an enterprise communication network comprising: a processing resource; and a memory resource communicatively coupled to the processing resource containing instructions executable by the processing resource to: extract a plurality of signifiers from an enterprise network using an extraction tool; define a distance metric between pairs of related signifiers among the plurality of signifiers, wherein defining a distance metric between a first signifier and a second signifier includes: calculate a frequency of co-related services associated with the first signifier and the second signifier; calculate a frequency of co-related phrases associated with the first signifier and the second signifier; average a location of the first signifier and the second signifier on the enterprise network; and define the distance metric as a sum of the frequency of co-related services, the frequency of co-related phrases, and the average location; and build a semantics graph for the enterprise communication network using the defined distance metrics between pairs of related signifiers, including the defined distance metric of the first signifier and the second signifier.
 15. The system of claim 14, wherein related signifiers include signifiers that have a co-occurrence on the enterprise network.